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Abstract: 

We present a Monte Carlo study of a model protein with 54 amino acids that folds 
directly to its native three-helix-bundle state without forming any well-defined inter- 
mediate state. The free-energy barrier separating the native and unfolded states of 
this protein is found to be weak, even at the folding temperature. Nevertheless, we 
find that melting curves to a good approximation can be described in terms of a sim- 
ple two-state system, and that the relaxation behavior is close to single exponential. 
The motion along individual reaction coordinates is roughly diffusive on timescales 
beyond the reconfiguration time for an individual helix. A simple estimate based 
on diffusion in a square-well potential predicts the relaxation time within a factor of 
two. 
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1 Introduction 



In a landmark paper in 1991, Jackson and Fersht £[] demonstrated that chymotrypsin 
inhibitor 2 folds without significantly populating any meta-stable intermediate state. 
Since then, it has become clear that this protein is far from unique; the same be- 
havior has been observed for many small single-domain proteins |2j- It is tempting 
to interpret the apparent two-state behavior of these proteins in terms of a simple 
free-energy landscape with two minima separated by a single barrier, where the min- 
ima represent the native and unfolded states, respectively. If the barrier is high, this 
picture provides an explanation of why the folding kinetics are single exponential, 
and why the folding thermodynamics show two-state character. 

However, it is well-known that the free-energy barrier, AF, is not high for all these 
proteins. In fact, assuming the folding time Tf to be given by Tf = r exp(AF/kT) 
with To ~ 1 fis |H| , it is easy to find examples of proteins with AF values of a few kT 
[2j (k is Boltzmann's constant and T the temperature). It should also be mentioned 
that Garcia-Mira et al. [3] recently found a protein that appears to fold without 
crossing any free-energy barrier. 

Suppose the native and unfolded states coexist at the folding temperature and that 
there is no well-defined intermediate state, but that a clear free-energy barrier is 
missing. What type of folding behavior should one then expect? In particular, would 
such a protein, due to the lack of a clear free-energy barrier, show easily detectable 
deviations from two-state thermodynamics and single-exponential kinetics? Here we 
investigate this question based on Monte Carlo simulations of a designed three-helix- 
bundle protein [SIIEIIZ1- 

Our study consists of three parts. First, we investigate whether or not melting 
curves for this model protein show two-state character. Second, we ask whether the 
relaxation behavior is single exponential or not, based on ensemble kinetics at the 
folding temperature. Third, inspired by energy-landscape theory (for a recent review, 
see Refs. [HUH]); we try to interpret the folding dynamics of this system in terms of 
simple diffusive motion in a low- dimensional free-energy landscape. 
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2 Model and Methods 



2.1 The Model 

Simulating atomic models for protein folding remains a challenge, although progress 
is currently being made in this area [T m iT H IT2 | IT3* | IT H IT5 | ITT)] ' Here, for computational 
efficiency, we consider a reduced model with 5-6 atoms per amino acid in which 
the side chains are replaced by large Cp atoms. Using this model, we study a designed 
three-helix-bundle protein with 54 amino acids. 

The model has the Ramachandran torsion angles 0j, ipi as its degrees of freedom, 
and is sequence-based with three amino acid types: hydrophobic (H), polar (P) and 
glycine (G). The sequence studied consists of three identical H/P segments with 
16 amino acids each (PPHPPHHPPHPPHHPP), separated by two short GGG seg- 
ments |17|ll8j. The H/P segment is such that it can make an a-helix with all the 
hydrophobic amino acids on the same side. 

The interaction potential 

E = E\ oc + E cv + E h b + E hp (1) 

is composed of four terms. The local potential E\ oc has a standard form with threefold 
symmetry, 

^i°c = y E^ 1 + cos3 ^) + y Z^ 1 + cos3 ^) • ( 2 ) 

i i 

The excluded- volume term E ev is given by a hard-sphere potential of the form 

^^E'^) 12 ' (3) 

i<j V 

where the sum runs over all possible atom pairs except those consisting of two hy- 
drophobic Cp. The parameter is given by = oi + <jj + Aer^-, where Acr^- = 
0.625 A for CpC, CpN and CpO pairs that are connected by a sequence of three 
covalent bonds, and Acr^ = A otherwise. The introduction of the parameter Aeg- 
ean be thought of change of the local potential. 

The hydrogen-bond term E'hb has the form 

E hh = e hb u(rij)v(atij, fy) , (4) 

ij 
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where the functions u(r) and v(a,j3) are given by 



, f cos 2 a cos 2 /3 a,/3>90° //A 

= (o otherwise (6) 

The sum in Eq. HJruns over all possible HO pairs, and denotes the HO distance, 
otij the NHO angle, and the HOC angle. The last term of the potential, the 
hydrophobicity term E^ p , is given by 



i<j J ■> 

where the sum runs over all pairs of hydrophobic Cp. 

To speed up the calculations, a cutoff radius r c is used, which is taken to be 4.5 A 
for E cv and E^, and 8 A for E^ v . Numerical values of all energy and geometry 
parameters can be found elsewhere [5]. 

The thermodynamic behavior of this three-helix-bundle protein has been studied 
before jSHEj- These studies demonstrated that this model protein has the following 
properties: 



• It does form a stable three-helix bundle, except for a twofold topological degen- 
eracy. These two topologically distinct states both contain three right-handed 
helices. They differ in how the helices are arranged. If we let the first two 
helices form a U, then the third helix is in front of the U in one case (FU), and 
behind the U in the other case (BU). The reason that the model is unable to 
discriminate between these two states is that their contact maps are effectively 
very similar [T9*] . 

• It makes more stable helices than the corresponding one- and two-helix se- 
quences, which is in accord with the experimental fact that tertiary interactions 
generally are needed for secondary structure to become stable. 

• It undergoes a first-order-like folding transition directly from an expanded state 
to the three-helix-bundle state, without any detectable intermediate state. At 
the folding temperature T f , there is a pronounced peak in the specific heat. 



Here we analyze the folding dynamics of this protein in more detail, through an 
extended study of both thermodynamics and kinetics. 
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As a measure of structural similarity with the native state, we monitor a parameter 
Q that we call nativeness (the same as in jSUHllZI)- To calculate Q, we use represen- 
tative conformations for the FU and BU topologies, respectively, obtained by energy 
minimization. For a given conformation, we compute the root-mean-square devia- 
tions 5f\j and 5bu from these two representative conformations (calculated over all 
backbone atoms). The nativeness Q is then obtained as 

Q = max [exp (-(^/(lOA) 2 ) , exp (-^/(lOA) 2 )] , (8) 

which makes Q a dimensionless number between and 1. 

Energies are quoted in units of kTf, with the folding temperature Tf defined as the spe- 
cific heat maximum. In the dimensionless energy unit used in our previous study [5], 
this temperature is given by /cT f = 0.6585 ± 0.0006. 



2.2 Monte Carlo Methods 

To simulate the thermodynamic behavior of this model, we use simulated temper- 
ing in which the temperature is a dynamic variable. This method is 
chosen in order to speed up the calculations at low temperatures. Our simulations 
are started from random configurations. The temperatures studied range from 0.95 Tf 
to 1.37T f . 

The temperature update is a standard Metropolis step. In conformation space we 
use two different elementary moves: first, the pivot move in which a single torsion 
angle is turned; and second, a semi-local method [22] that works with seven or eight 
adjacent torsion angles, which are turned in a coordinated manner. The non-local 
pivot move is included in our calculations in order to accelerate the evolution of the 
system at high temperatures. 

Our kinetic simulations are also Monte Carlo-based, and only meant to mimic the time 
evolution of the system in a qualitative sense. They differ from our thermodynamic 
simulations in two ways: first, the temperature is held constant; and second, the non- 
local pivot update is not used, but only the semi- local method (231 • This restriction 
is needed in order to avoid large unphysical deformations of the chain. 

Statistical errors on thermodynamic results are obtained by jackknife analysis |21] of 
results from ten or more independent runs, each containing several folding/unfolding 
events. All errors quoted are la errors. Statistical errors on relaxation times are 
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difficult to determine due to uncertainties about where the large-time behavior sets 
in and are therefore omitted. We estimate that the uncertainties on our calculated 
relaxation times are about 10%. The statistical errors on the results obtained by 
numerical solution of the diffusion equation are, however, significantly smaller than 
this. 

All fits of data discussed below are carried out by using a Levenberg-Marquardt 
procedure fI5\ . 



2.3 Analysis 

Melting curves for proteins are often described in terms of a two-state picture. In the 
two-state approximation, the average of a quantity X at temperature T is given by 

X(T) = X ^ X » K{ ?) ( 9 ) 
1 ] 1 + K(T) ' {) 

where K(T) = P n (T)/P u (T), P n (T) and P U (T) being the populations of the native 
and unfolded states, respectively. Likewise, X n and X u denote the respective values 
of X in the native and unfolded states. The effective equilibrium constant K(T) is to 
leading order given by K(T) = exp[(l/kT — l/kT m )AE], where T m is the midpoint 
temperature and AE the energy difference between the two states. With this K(T), 
a fit to Eq. Elhas four parameters: AE, T m and the two baselines X u and X n . 

A simple but powerful method for quantitative analysis of the folding dynamics is 
obtained by assuming the motion along different reaction coordinates to be diffu- 
sive |25l27j . The folding process is then modeled as one-dimensional Brownian motion 
in an external potential given by the free energy F(r) = — kT In P oq (r), where P eq (r) 
denotes the equilibrium distribution of r. Thus, it is assumed that the probability 
distribution of r at time t, P(r,t), obeys Smoluchowski's diffusion equation 



dP(r, t) d 



dt dr 
where D(r) is the diffusion coefficient. 



dP{r,t) | P(r,t) dF(r) 



dr kT dr 



(10) 



This picture is not expected to hold on short timescales, due to the projection onto a 
single coordinate r, but may still be useful provided that the diffusive behavior sets 
in on a timescale that is small compared to the relaxation time. By estimating D(r) 
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AE/kTf 



T m /T { 




40.1 ± 3.3 
41.0 ± 2.6 
45.4 ± 3.3 
45.7 ± 3.8 
53.6 ± 2.1 



1.0050 ± 0.0020 
1.0024 ± 0.0017 
1.0056 ± 0.0017 
1.0099 ± 0.0018 
0.9989 ± 0.0008 



Table 1: Parameters AE and T m obtained by fitting results from our thermodynamic 
simulations to the two-state expression in Eq. E3 This is done individually for each of 
the quantities in the first column; the energy E, the hydrogen-bond energy E^b, the 
hydrophobicity energy Eh p , the radius of gyration R g (calculated over all backbone 
atoms), and the nativeness Q (see Eq. |SJ). The fits are performed using seven data 
points in the temperature interval 0.95 Tf < T < 1.11 Tf. 



and F(r), it is then possible to predict the relaxation time from Eq. Such an 
analysis has been successfully carried through for a lattice protein |27j . 

The relaxation behavior predicted by Eq. El is we h understood when F(r) has the 
shape of a double well with a clear barrier. In this situation, the relaxation is single 
exponential with a rate constant given by Kramers' well-known result [2H] • However, 
this result cannot be applied to our model, in which the free-energy barrier is small 
or absent, depending on which reaction coordinate is used. Therefore, we perform a 
detailed study of Eq. E3for some relevant choices of D(r) and F(r), using analytical 
as well as numerical methods. 



3 Results 



3.1 Thermodynamics 

In our thermodynamic analysis, we study the five different quantities listed in Tabled 
The first question we ask is to what extent the temperature dependence of these 
quantities can be described in terms of a first-order two-state system (see Eq. EJ). 

Fits of our data to this equation show that the simple two-state picture is not perfect 
(x 2 per degree of freedom, dof, of ~ 10), but this can be detected only because the 
statistical errors are very small at high temperatures (< 0.1%). In fact, if we assign 
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E hb /kT f RJA 




J I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I L 

0.95 1.00 1.05 1.10 0.95 1.00 1.05 1.10 

T/T { T/T f 



Figure 1: Temperature dependence of (a) the hydrogen-bond energy E hh and (b) the 
radius of gyration R g . The lines are fits to Eq. E3 



artifical statistical errors of 1% to our data points, an error size that is not uncommon 
for experimental data, then the fits become perfect with a x 2 /dof close to unity. Fig.^ 
shows the temperature dependence of the hydrogen-bond energy and the radius 
of gyration R g , along with our two-state fits. 

Table ^ gives a summary of our two-state fits. In particular, we see that the fitted 
values of both the energy change AE and the midpoint temperature T m are similar 
for the different quantities. It is also worth noting that the T m values fall close to the 
folding temperature Tf, defined as the maximum of the specific heat. The difference 
between the highest and lowest values of T m is less than 1%. There is a somewhat 
larger spread in AE, but this parameter has a larger statistical error. 

So, the melting curves show two-state character, and the fitted parameters AE and 
T m are similar for different quantities. From this it may be tempting to conclude 
that the thermodynamic behavior of this protein can be fully understood in terms of 
a two-state system. The two-state picture is, nevertheless, an oversimplification, as 
can be seen from the shapes of the free-energy profiles F(E) and F(Q). Fig. El shows 
these profiles at T = Tf. First of all, these profiles show that the native and unfolded 
states coexist at T = T { , so the folding transition is first-order-like. However, there 
is no clear free-energy barrier separating the two states; F(Q) exhibits a very weak 
barrier, < 1 kT, whereas F(E) shows no barrier at all. In fact, F(E) has the shape 
of a square well rather than a double well. 
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Figure 2: Free-energy profiles at T — Tf for (a) the energy E and (b) the nativeness 
Q (dark bands). The light-grey bands show free energies for block averages (see 
Eq. IT2"|) . using a block size of Tb = 10 6 MC steps. Each band is centered around the 
expected value and shows statistical la errors. 

Phase transition terminology is, by necessity, ambiguous for a finite system like this, 
but if states with markedly different E or Q coexist it does make sense to call the 
transition first-order-like, even if a free-energy barrier is missing. At a second-order 
phase transition, the free-energy profile is wide, but the minimum remains unique. 



3.2 Kinetics 



Our kinetic study is performed at T = Tf. Using Monte Carlo dynamics (see Model 
and Methods), we study the relaxation of ensemble averages of various quantities. 
For this purpose, we performed a set of 3000 folding simulations, starting from equi- 
librium conformations at temperature T ~ 1.06 Tf. At this temperature, the chain 
is extended and has a relatively low secondary-structure content (see Fig. [TJ . 

In the absence of a clear free-energy barrier (see Fig. |2J, it is not obvious whether or 
not the relaxation should be single exponential. To get an idea of what to expect for 
a system like this, we consider the relaxation of the energy £ina potential F(E) that 
has the form of a perfect square well at T = Tf. For this idealized F(E) and a constant 
diffusion coefficient D(E), it is possible to solve Eq. El analytically for relaxation at 
an arbitrary temperature T. This solution is given in Appendix A, for the initial 
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Figure 3: Level diagram showing the deviation (in %) from a single exponential for 
diffusion in energy in a square well, based on the exact solution in Appendix A. 
The system relaxes at temperature T, starting from the equilibrium distribution at 
temperature T . p is defined as p = ((E) — E n )/AE SW , where (E) is the average 
energy at temperature T, and E n and AT SW denote the lower edge and the width, 
respectively, of the square well, p can be viewed as a measure of the unfolded popu- 
lation at temperature T, and is 0.5 if T = Tf. po is the the corresponding quantity 
at temperature T . As a measure of the deviation from a single exponential, we take 
8meLx/8E(t ), where 5 max is the maximum deviation from a fitted exponential and 
5E(to) = E(to) — (E), E(to) being the mean at the smallest time included in the fit, 
to- Data at times shorter than 1% of the relaxation time were excluded from the fit. 

condition that P(E,t = 0) is the equilibrium distribution at temperature T . Using 
this result, the deviation from single-exponential behavior can be mapped out as a 
function of To and T, as is illustrated in Fig. El The size of the deviation depends 
on both To and T, but is found to be small for a wide range of To, T values. This 
clearly demonstrates that the existence of a free-energy barrier is not a prerequisite 
to observe single-exponential relaxation. 

Let us now turn to the results of our simulations. Fig. 0] shows the relaxation of 
the average energy E and the average nativeness Q in Monte Carlo (MC) time. In 
both cases, the large-time data can be fitted to a single exponential, which gives 
relaxation times of r pa 1.7 ■ 10 7 and r ~ 1.8 • 10 7 for E and Q, respectively, in units 
of elementary MC steps. The corresponding fits for the radius of gyration and the 
hydrogen-bond energy (data not shown) give relaxation times of r pa 2.1 • 10 7 and 
t pa 1.8 ■ 10 7 , respectively. The fit for the radius of gyration has a larger uncertainty 
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Figure 4: Relaxation behavior of the three-helix-bundle protein at the folding tem- 
perature Tf, starting from the equilibrium ensemble at To ~ 1.06Tf. (a) 5E(t) = 
E(t) — (E) against simulation time t, where E(t) is the average E after t MC steps 
(3000 runs) and (E) denotes the equilibrium average (at Tf). (b) Same plot for the 
nativeness Q. 



than the others, because the data points have larger errors in this case. 

The differences between our four fitted r values are small and most probably due 
to limited statistics for the large-time behavior. Averaging over the four different 
variables, we obtain a relaxation time of r « 1.8 • 10 7 MC steps for this protein. 
The fact that the relaxation times for the hydrogen-bond energy and the radius of 
gyration are approximately the same shows that helix formation and chain collapse 
proceed in parallel for this protein. This finding is in nice agreement with recent 
experimental results for small helical proteins 

For Q, it is necessary to go to very short times in order to see any significant devia- 
tion from a single exponential (see Fig. HJ). For E, we find that the single-exponential 
behavior sets in at roughly r/3, which means that the deviation from this behavior 
is larger than in the analytical calculation above. On the other hand, for compar- 
isons with experimental data, we expect the behavior of Q to be more relevant than 
that of E. The simulations confirm that the relaxation can be approximately single 
exponential even if there is no clear free-energy barrier. 

To translate the relaxation time for this protein into physical units, we compare with 
the reconfiguration time for the corresponding one- helix segment. To that end, we 
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performed a kinetic simulation of this 16-amino acid segment at the same tempera- 
ture, T = Tf. This temperature is above the midpoint temperature for the one-helix 
segment, which is 0.95 Tf jH]. So, the isolated one-helix segment is unstable at T = Tf, 
but makes frequent visits to helical states with low hydrogen-bond energy, E hh . To 
obtain the reconfiguration time, we fitted the large-time behavior of the autocorrela- 
tion function for E hh , 

C hh (t) = (E hh (t)E hh (0)} - (E hh (0)} 2 , (11) 

to an exponential. The exponential autocorrelation time, which can be viewed as a 
reconfiguration time, turned out to be 7"h « 1.0 • 10 6 MC steps. This is roughly a 
factor 20 shorter than the relaxation time r for the full three-helix bundle. Assuming 
the reconfiguration time for an individual helix to be ~0.2/xs |H0[IH1|. we obtain 
relaxation and folding times of ~ 4 fis and ~ 8 /is, respectively, for the three-helix 
bundle. This is fast but not inconceivable for a small helical protein [2]. In fact, the 
B domain of staphylococcal protein A is a three-helix-bundle protein that has been 
found to fold in < 10 /is, at 37°C [22]. 



3.3 Relaxation-Time Predictions 

We now turn to the question of whether the observed relaxation time can be predicted 
based on the diffusion equation, Eq. For that purpose, we need to know not 
only the free energy E(r), but also the diffusion coefficient D(r). Socci et al. |2*7] 
successfully performed this analysis for a lattice protein that exhibited a relatively 
clear free-energy barrier. Their estimate of D(r) involved an autocorrelation time for 
the unfolded state. The absence of a clear barrier separating the native and unfolded 
states makes it necessary to take a different approach in our case. 

The one-dimensional diffusion picture is not expected to hold on short timescales, but 
only after coarse-graining in time. A computationally convenient way to implement 
this coarse-graining in time is to study block averages b(t) defined by 

b(t) = — r(s) t = 0,r b ,2r b ,... (12) 

Th 

i<s<t+r b 

where r b is the block size and r is the reaction coordinate considered. The diffusion 
coefficient can then be estimated using -D b (r) = ((5b) 2 ) /2r b , where the numerator is 
the mean-square difference between two consecutive block averages, given that the 
first of them has the value r. 
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Ar sw -^b 7~pred,0 ^prcd T 

~E: U0kT { (9.3 ±0.2) ■ 10" 5 (A;T f ) 2 2.1 • 10 7 1.9 • 10 7 1.7 - 10 7 
Q: 1.0 (1.00 ± 0.02) • 10^ 8 1.0 • 10 7 0.8 • 10 7 1.8 ■ 10 7 

Table 2: The predictions r pre d,o and r pre d (see text) along with the observed relaxation 
time r, as obtained from the data in Fig. for the energy E and the nativeness 
Q. Ar sw is the width of the square-well potential and is the average diffusion 
coefficient. 



In our calculations, we use a block size of Tb = 10 6 MC step, corresponding to the 
reconfiguration time Th for an individual helix. We do not expect the dynamics to 
be diffusive on timescales shorter than this, due to steric traps that can occur in the 
formation of a helix. In order for the dynamics to be diffusive, the timescale should 
be such that the system can escape from these traps. 

Using this block size, we first make rough estimates of the relaxation times for E 
and Q based on the result in Appendix A for a square-well potential and a constant 
diffusion coefficient. These estimates are given by r prc d,o = Ar^ w / 'Db^ 2 , where Ar sw is 
the width of the potential and is the average diffusion coefficient. * Our estimates 
of Ar sw and D h can be found in Table El along with the resulting predictions r prcd . 
We find that these simple predictions agree with the observed relaxation times r 
within a factor of two. 

We also did the same calculation for smaller block sizes, 7b = 10°, 10 1 , . . ., 10 5 MC 
steps. This gave r prc d,o values smaller or much smaller than the observed r, signaling 
non-diffusive dynamics. This confirms that for b(t) to show diffusive dynamics, Tt, 
should not be smaller than the reconfiguration time for an individual helix. 

Having seen the quite good results obtained by this simple calculation, we now turn to 
a more detailed analysis, illustrated in Fig. The block size is the same as before, 
r b = 10 6 MC steps, but the space dependence of the diffusion coefficient D h (r) is 
now taken into account, and the potential, F h (r), reflects the actual distribution of 
block averages. The potential Ft,(r), shown in Fig. |21 is not identical to that for the 
unblocked variables. At a first-order- like transition, we expect free-energy minima to 
become more pronounced when going to the block variables, provided that the block 
size Tb is small compared with the relaxation time, because when forming the block 
variables one effectively integrates out fluctuations about the respective states. The 



^Eq. El in Appendix A can be applied to other observables than E. The predicted relaxation 
time r predj0 is given by t x . 



13 




Figure 5: (a) Numerical solution of Eq. fTUI with the energy as reaction coordinate. 
The distribution P(E,t) is shown for t — 0, r/3, r and 2r (full lines), where r is 
the relaxation time. The dashed line is the equilibrium distribution. The diffusion 
coefficient D h (E) and the potential F h (E) (light-gray band in Fig. Ek) were both 
determined from numerical simulations, using a block size of = 10 6 MC steps (see 
Eq. H2J). (b) The space dependence of the diffusion coefficient D b (E). The band is 
centered around the expected value and shows the statistical la error. 



results in Fig. El do show this tendency, although the effect is not very strong. Fig. 
shows the diffusion coefficient D\>(E), which is largest at intermediate values between 
the native and unfolded states. The behavior of D^(Q) (not shown) is the same in 
this respect. Hence, there is no sign of a kinetic bottleneck to folding for this protein. 

Given -Db(r) and F^,(r), we solve Eq. El for P(r,t) by using the finite-difference 
scheme in Appendix B. The initial distribution P(r, t — 0) is taken to be the same as 
in the kinetic simulations. We find that the mean of P(r, t) shows single-exponential 
relaxation to a good approximation. An exponential fit of these data gives us a new 
prediction, r pre( j, for the relaxation time. 

From Table El it can be seen that the prediction obtained through this more elaborate 
calculation, r pre d, is not better than the previous one, r pre d,o, at least not in Q, despite 
that there exists a weak barrier in this coordinate (see Fig. 2b). This means that the 
barrier in Q is too weak to be important for the relaxation rate. If the underlying 
diffusion picture, Eq. HH had been perfect, r prcd would have been equal to r, as 
obtained from the kinetic simulations. Our results show that this is not the case. 
At least in Q, there are significant deviations from the behavior predicted by this 
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equation. 



If more accurate relaxation time predictions are needed, there are different ways to 
proceed. One possible way is to simply increase the block size. However, for the 
calculation to be useful, the block size must remain small compared to the relaxation 
time. A more interesting possibility is to refine the simple diffusion picture defined by 
Eq. in which, in particular, non-Markovian effects are ignored. Such effects may 
indeed affect folding times jHSlE] • Yet another possibility is to use a combination of a 
few different variables, perhaps E and Q, instead of a single reaction coordinate [Ml 
With a multidimensional representation of the folding process, non-Markovian 
effects could become smaller. 



4 Summary and Discussion 



We have analyzed the thermodynamics and kinetics of a designed three-helix-bundle 
protein, based on Monte Carlo calculations. We found that this model protein shows 
two-state behavior, in the sense that melting curves to a good approximation can be 
described by a simple two-state system and that the relaxation behavior is close to 
single exponential. A simple two-state picture is, nevertheless, an oversimplification, 
as the free-energy barrier separating the native and unfolded states is weak (< lkT). 
The weakness of the barrier implies that a fitted two-state parameter such as AE 
has no clear physical meaning, despite that the two-state fit looks good. 

Reduced jSHllIHlEIlEHlEn] and all-atom |IIliU[ni[ini[Hll2] models for sma11 helical 
proteins have been studied by many other groups. Most of these studies relied on 
so-called Go-type [13] potentials. It should therefore be pointed out that our model 
is sequence-based. 

Using an extended version of this model that includes all atoms, we recently found 
similar results for two peptides, an a-helix and a /5-hairpin ^H]- Here the calculated 
melting curves could be directly compared with experimental data, and a reasonable 
quantitative agreement was found. 

The smallness of the free-energy barrier prompted us to perform an analytical study 
of diffusion in a square-well potential. Here we studied the relaxation behavior at 
temperature T, starting from the equilibrium distribution at temperature T , for ar- 
bitrary T and T . We found that this system shows a relaxation behavior that is 
close to single exponential for a wide range of T , T values, despite the absence of a 
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free-energy barrier. We also made relaxation-time predictions based on this square- 
well approximation. Here we took the diffusion coefficient to be constant. It was 
determined assuming the dynamics to be diffusive on timescales beyond the reconfig- 
uration time for an individual helix. The predictions obtained this way were found 
to agree within a factor of two with observed relaxation times, as obtained from the 
kinetic simulations. So, this calculation, based on the two simplifying assumptions 
that the potential is a square well and that the diffusion coefficient is constant, gave 
quite good results. A more detailed calculation, in which these two additional as- 
sumptions were removed, did not give better results. This shows that the underlying 
diffusion picture leaves room for improvement. 

Our kinetic study focused on the behavior at the folding temperature Tf, where 
the native and unfolded states, although not separated by a clear barrier, are very 
different, which makes the folding mechanism transparent. In particular, we found 
that helix formation and chain collapse could not be separated, which is in accord 
with experimental data by Krantz et al. j2H] • The difference between the native and 
unfolded states is much smaller at the lowest temperature we studied, 0.95T f , because 
the unfolded state is much more native-like here. Mayor et al. j33] recently reported 
experimental results on a three- helix-bundle protein, the engrailed homeodomain 
including a characterization of its unfolded state. In particular, the unfolded state 
was found to have a high helix content. This study was performed at a temperature 
below 0.95Tf. It would be very interesting to see what the unfolded state of this 
protein looks like near Tf. In our model, there is a significant decrease in helix 
content of the unfolded state as the temperature increases from 0.95Tf to Tf. 

It is instructive to compare our results with those of Zhou and Karplus jHZj, who 
discussed two folding scenarios for helical proteins, based on a Go-type C a model. 
In their first scenario, folding is fast, without any obligatory intermediate, and helix 
formation occurs before chain collapse. In the second scenario, folding is slow with 
an obligatory intermediate on the folding pathway, and helix formation and chain 
collapse occur simultaneously. The behavior we find does not match any of these 
two scenarios. In our case, helix formation and chain collapse occur in parallel but 
folding is nevertheless fast and without any well-defined intermediate state. 



Acknowledgments 

This work was in part supported by the Swedish Foundation for Strategic Research 
and the Swedish Research Council. 



16 



Appendix A: Diffusion in a square well 



Here we discuss Eq. i n the situation when the reaction coordinate r is the energy 
E, and the potential F(E) is a square well of width AE SW at T = Tf. This means that 
the equilibrium distribution is given by P cq (E) oc exp(— 5(3E) if E is in the square 
well and P c<i (E) = otherwise, where 5(3 = 1/kT — 1/kTf. Eq. ITUlthen becomes 



dP(E,t) d 
Ft ~ dE 



(13) 



For simplicity, the diffusion coefficient is assumed to be constant, D(E) = D. The 
initial distribution P(E, t — 0) is taken to be the equilibrium distribution at some 
temperature T , and we put 5/3o = l/kT — 1/kTf. 

By separation of variables, it is possible to solve Eq. with this initial condition 
analytically for arbitrary values of the initial and final temperatures To and T, re- 
spectively. In particular, this solution gives us the relaxation behavior of the average 
energy. The average energy at time t, E(t), can be expressed in the form 

oo 

E(t) = (E) + J2^e- t/Tk , (14) 

k=l 

where (E) denotes the equilibrium average at temperature T. A straightforward 
calculation shows that the decay constants in this equation are given by 

i/n = (vrV + W&El) (15) 

sw 

and the expansion coefficients by 

A k = B k AE sw - 2 , (16) 

{nW + (6(3 - \5(3fAE^) + \6^AE^) 2 



45f3 AE sw j cosh (|(5/3 - U(3)AE SVI ) cosh \5(3AE SW if k odd 



where 

' sinh ~5(3 AE SV! \ sinh - \5(3) AE SW ) sinh \5(3AE SW if k even 1 

Finally, the equilibrium average is 

/en E n + E u 1 Ai? sw j A 
(£) = g + ^ 2~ 2^ AjE ™ ' 1 

where E n and -E u are the lower and upper edges of the square well, respectively. 
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It is instructive to consider the behavior of this solution when \5j3 — 6/3q\ <C l/AE aw . 
The expression for the expansion coefficients can then be simplified to 

A^B k AE " V W; S M AE - (19) 



with 



45f3 AE sw / cosh 2 ±<^A£ SW if k odd ,„.,. 

X i ™aiMAP ;f i. l ZU i 



sinh i<5/3 A£ sw \ sinh' |<5/3A£ SW if k even 



Note that A k scales as fc 2 if k < A£ sw , and as 1/A; 4 if fc > ^|5/3|AE SW . Note 

also that the last factor in B k suppresses A k for even k if T is close to Tf. From these 
two facts it follows that |Ai| is much larger than the other \A k \ if T is near Tf. This 
makes the deviation from a single exponential small. 



Appendix B: Numerical solution of the diffusion equation 



To solve Eq. El numerically for arbitrary D(r) and F(r), we choose a finite-difference 
scheme of Crank-Nicolson type with good stability properties. To obtain this scheme 
we first discretize r. Put rj = jAr, Dj = D(rj) and Fj = F(rj), and let p(£) be 
the vector with components Pj(t) = P{rj, t). Approximating the RHS of Eq. II 01 with 
suitable finite differences, we obtain 

^ = a pW > ( 21 ) 

where A is a tridiagonal matrix given by 

(ApW), = [D j+1/2 (p j+1 (t) - Pj (t)) - Dj-ytfaW-pj^t))] 

Let now p n = p(t n ), where t n = nAt. By applying the trapezoidal rule for integration 
to Eq. we obtain 

p n+l _ p n = At + Ap n+lJ _ (23) 

This equation can be used to calculate how P(r, t) evolves with time. It can be 
readily solved for p n+1 because the matrix A is tridiagonal. 
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